List of Flash News about AI inference cost
| Time | Details |
|---|---|
| 2025-12-10 17:15 | **Andrej Karpathy Benchmarks GPT-5.1 Thinking API on 930 Hacker News Threads: 3 Hours to Build, 1 Hour to Run, $60 Cost**<br>According to @karpathy, he used the GPT-5.1 Thinking API to auto-grade all 930 December 2015 Hacker News frontpage article-discussion pairs to identify the most and least prescient comments; the code took about 3 hours to write and roughly 1 hour and $60 to run (source: twitter.com/karpathy/status/1998803709468487877 and karpathy.bearblog.dev/auto-grade-hn). The project repository is available at github.com/karpathy/hn-time-capsule and the full results are browsable at karpathy.ai/hncapsule. Karpathy emphasized in-hindsight analysis as a practical way to train forward-prediction models and noted that future LLMs will perform such work cheaper, faster, and better. He named the top 10 most prescient HN accounts for that month as pcwalton, tptacek, paulmd, cstross, greglindahl, moxie, hannob, 0xcde4c3db, Manishearth, and johncolanduoni. For traders tracking AI infrastructure efficiency, these figures, roughly $60 for a 930-thread pass in about one hour, provide a concrete real-world benchmark for large-scale LLM evaluation workflows using GPT-5.1 Thinking (source: twitter.com/karpathy/status/1998803709468487877 and karpathy.bearblog.dev/auto-grade-hn). |
| 2025-08-15 17:28 | **Nic Carter: AI Inference Costs Deflating Fast; $200/mo Could Halve in 6 Months, with Trading Implications for AI SaaS and Crypto Tokens**<br>According to @nic__carter, AI inference costs are falling 10–1000x annually, implying that a $200/month AI subscription could drop by roughly half within six months as VC subsidies bridge near-term pricing (source: @nic__carter on X, Aug 15, 2025). For traders, this points to price compression risk across AI SaaS and model API vendors, warranting conservative revenue and ARPU assumptions in the near term (source: @nic__carter). The broader downtrend is corroborated by vendor pricing, as OpenAI launched GPT-4o at $5 per 1M input tokens and $15 per 1M output tokens in May 2024, below prior GPT-4 Turbo levels, reinforcing rapid cost deflation in the stack (source: OpenAI, May 13, 2024). In crypto markets, lower inference costs could expand adoption of onchain AI agents and data feeds while pressuring compute-linked revenue models, prompting a valuation re-rate toward usage-driven growth over pure pricing power, based on Carter’s cost trajectory (source: @nic__carter). Near-term trade tilt: prioritize adoption and volume beneficiaries over high-price narratives until pricing stabilizes, given Carter’s projected deflation path (source: @nic__carter). |
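The cost figures above lend themselves to quick back-of-envelope checks. Below is an illustrative sketch (not from either source) that derives the per-thread cost of Karpathy's run and shows what smoothly compounding annual cost deflation implies for a subscription price over six months; the annual decline factors passed in are hypothetical inputs for illustration.

```python
# Illustrative back-of-envelope math on the figures cited above.
# Assumption: cost deflation compounds smoothly, so a half-year
# applies the square root of the annual decline factor.

def cost_per_item(total_cost_usd: float, n_items: int) -> float:
    """Average cost of one evaluation in a batch LLM run."""
    return total_cost_usd / n_items

def price_after_six_months(price_usd: float, annual_decline_factor: float) -> float:
    """Price after 6 months if costs fall annual_decline_factor-fold per year."""
    return price_usd / (annual_decline_factor ** 0.5)

# Karpathy run: $60 for 930 threads -> about 6.5 cents per thread.
per_thread = cost_per_item(60, 930)

# Carter claim: a $200/mo price halving in 6 months corresponds to a
# 4x annual decline (2 squared); a 10x annual decline would instead
# cut the price by a factor of ~3.16 over the same half year.
at_10x_annual = price_after_six_months(200, 10)

print(f"per-thread cost: ${per_thread:.4f}")
print(f"$200/mo under 10x annual deflation, after 6 months: ${at_10x_annual:.2f}")
```

The takeaway is that even the low end of the cited 10–1000x annual range compresses prices faster than the "halve in six months" framing, which itself only requires a 4x annual decline.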